Asynchronous Distributed Neural Network Training using Alternating Direction Method of Multipliers
Abstract
Since the first appearance of a large-scale dataset [4] and powerful computational resources such as GPUs, Convolutional Neural Networks (CNNs) have become the essential machine learning algorithm for image classification, detection, and many other applications. As the popularity of CNNs has increased, so has their size [10, 13]. (For instance, AlexNet has more than 60 million parameters.) The performance of such large networks improves steadily, yet training them takes longer. In this paper, we propose a novel asynchronous distributed neural network optimization method using the Alternating Direction Method of Multipliers (ADMM) [6]. Unlike previous works on distributed optimization for neural network training [1, 5, 3, 12], which rely solely on primal optimization, we formulate the problem as a global consensus optimization and distribute the neural network training in a principled fashion. We evaluate our framework on the CIFAR10 dataset and analyze the effect of the hyperparameters introduced by ADMM.
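To illustrate what a global consensus formulation looks like, the following is a minimal sketch of standard scaled-form consensus ADMM on a toy problem. It is not the paper's method: it is synchronous rather than asynchronous, and it fits a single scalar weight by least squares instead of training a CNN. The function names, the data shards, and the values of `rho` and `iters` are all assumptions chosen for illustration.

```python
# Toy global consensus ADMM (scaled form), synchronous, pure Python.
# Assumption: each worker i holds a data shard (xs, ys) and the local loss
#   f_i(w) = 0.5 * sum_j (w * x_j - y_j)^2
# The consensus problem is: minimise sum_i f_i(w_i) subject to w_i = z for all i.


def local_update(xs, ys, z, u, rho):
    """Closed-form minimiser of f_i(w) + (rho / 2) * (w - z + u)^2."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (sxy + rho * (z - u)) / (sxx + rho)


def consensus_admm(shards, rho=10.0, iters=100):
    """Run synchronous consensus ADMM and return the consensus variable z."""
    n = len(shards)
    u = [0.0] * n  # scaled dual variable, one per worker
    z = 0.0        # global consensus variable
    for _ in range(iters):
        # Primal step: every worker solves its own regularised local problem.
        w = [local_update(xs, ys, z, u[i], rho)
             for i, (xs, ys) in enumerate(shards)]
        # Consensus step: average the local weights plus scaled duals.
        z = sum(wi + ui for wi, ui in zip(w, u)) / n
        # Dual step: accumulate each worker's disagreement with the consensus.
        u = [ui + wi - z for wi, ui in zip(w, u)]
    return z


# Hypothetical shards, one per worker.
shards = [
    ([1.0, 2.0], [2.1, 3.9]),
    ([3.0, 4.0], [6.2, 7.8]),
    ([5.0], [10.1]),
]
z = consensus_admm(shards)

# Reference: the least-squares solution on the pooled data.
exact = (sum(x * y for xs, ys in shards for x, y in zip(xs, ys))
         / sum(x * x for xs, _ in shards for x in xs))
print(z, exact)
```

The penalty parameter `rho` is the kind of hyperparameter that ADMM introduces: in this toy problem it controls how strongly each worker is pulled toward the consensus value at every step, and values far from the scale of the local curvature slow convergence.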
Similar resources
Managing Photovoltaic Generation Effect On Voltage Profile Using Distributed Algorithm
In this paper, a distributed method for reactive power management in a distribution system is presented. The proposed method focuses on the voltage rise in distribution systems equipped with a considerable number of photovoltaic units. This paper proposes the alternating direction method of multipliers (ADMM) approach for solving the optimal voltage control problem in a distri...
Training Deep Neural Networks via Optimization Over Graphs
In this work, we propose to train a deep neural network by distributed optimization over a graph. Two nonlinear functions are considered: the rectified linear unit (ReLU) and a linear unit with both lower and upper cutoffs (DCutLU). The problem reformulation over a graph is realized by explicitly representing ReLU or DCutLU using a set of slack variables. We then apply the alternating direction...
Distributed Voltage Control in Distribution Networks with High Penetration of Photovoltaic Systems
In this paper, a distributed method for reactive power management in a distribution system is presented. The proposed method focuses on the voltage rise in distribution systems equipped with a considerable number of photovoltaic units. This paper proposes the alternating direction method of multipliers (ADMM) approach for solving the optimal voltage control problem in a distri...
Combining the benefits of function approximation and trajectory optimization
Neural networks have recently solved many hard problems in Machine Learning, but their impact in control remains limited. Trajectory optimization has recently solved many hard problems in robotic control, but using it online remains challenging. Here we leverage the high-fidelity solutions obtained by trajectory optimization to speed up the training of neural network controllers. The two learni...
Modified Convex Data Clustering Algorithm Based on Alternating Direction Method of Multipliers
Because the main weakness of most standard methods, including k-means and hierarchical data clustering, is their sensitivity to initialization and their tendency to become trapped in local minima, this paper proposes a modification of convex data clustering in which there is no need to be particular about how the initial values are selected. Due to properly converting the task of optimization to an equivalent...